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ABSTRACT 

We consider a wide range of parametric mass models for B 1933+503, a ten-image 
radio lens, and identify shared properties of the models with the best fits. The 
approximate rotation curves varies by less than 8.5% from the average value between 
the innermost and the outermost image (1.5/i~^ kpc to 4.1/i~^ kpc) for models within 
la of the best fit, and the radial dependence of the shear strength and angle also have 
common behavior for the best models. The time delay between images 1 and 6, the 
longest delay between the radio cores, is At = (10.6t^;f days {Uq = 0.3, Aq = 0.7) 
including all the modeling uncertainties. Deeper infrared observations, to more 
precisely register the lens galaxy with the radio images and to measure the properties 
of the Einstein ring image of the radio source's host galaxy, would significantly improve 
the model constraints and further reduce the uncertainties in the mass distribution 
and time delay. 



Subject headings: cosmology: gravitational lensing 



1. Introduction 



Strong gravitational lenses can aid in the determination of two astrophysical quantities of 
great interest: the density profile of massive galaxies and the value of the Hubble constant. 

The density profile of massive galaxies is not known. Massive early-type and late-type galaxies 
appear to have mass distributions consistent with flat rotation curves (total mass M oc r, density 
p oc 1/r^) on scales outside their cores and inside their outer edges. This is most cleanly shown 
by spiral galaxy rotation curves (e.g. Rubin, Ford & Thonnard 1978, 19801 ), elliptical galaxy 
X-ray halos (Fabbiano |1989| ), and some gravitational lenses (e.g., Kochanek 1995). It is also 



supported by modern stellar dynamical studies of early-type galaxies (e.g., Rix et al. 1997). The 
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central properties of galaxies are characterized by density cusps, rather than finite core radii. In 
particular, the central luminosity profiles of massive galaxies generically show density cusps with 
p oc r~'^ and 1 ^ 7 ^ 2 (e.g. Faber et al. 1997] ) . Halos in dark matter simulations, which appear 
to form with 7 
et al. 



1 — 1.5 cusps (e.g. Navarro, Frenk &: White 1997, Klypin et al. 1999, Moore 
1998| ), suggest that the dark matter cores should have similar properties. However, it is 



expected that the dark matter profile will be significantly modified by the subsequent evolution 



and interaction with the baryons (e.g. Blumenthal et al. |l98q , Dubinski |1994| , Navarro, Eke & 
Frenk 1996 , Gelato & Sommer-Larsen 1999| , Binney, Gerhard & Silk 2000 ). Currently popular 
scenarios for self-interacting dark matter (SIDM, see e.g. Spergel & Steinhardt pOOOj ; Kochanek 
&: White 200C| ) further complicate predictions for the central density structure of galaxies. The 
resulting central density profile of the dark matter in massive galaxies is not known. Even less is 
known about the angular shapes of galaxy mass distributions, although spiral galaxies are clearly 



far rounder than their luminous disks (see Sackett 1995 for a review) and ellipticals can be either 



flatter or rounder than their stellar profile (see Buote & Canizares 1997 and references therein) 



and are modestly triaxial (e.g. Franx, Illingworth & de Zeeuw 1991). See, e.g., Rusin and Tegmark 



(2000) for a recent summary and more references. 



Multiple-image gravitational lenses offer a promising, newer probe of galaxy mass distributions. 
The image geometry usually constrains the mass inside the critical line of the lens to an accuracy 
exceeding that of any other methodQ. Unlike other probes of galaxy mass distributions, lenses 
also measure the shape of the gravitational potential near the critical line accurately. In most 
systems, however, this well measured quantity includes contributions from the lens galaxy and 
tidal perturbations (either local to the lens or along the line of sight) which can be difficult 
to disentangle (see e.g. Bar-Kana 1996, Keeton, Kochanek & Seljak p"997| ). In the majority of 
lenses, our ability to measure the radial or angular mass distribution is limited by the paucity of 
constraints supplied by the small numbers of images (2 or 4). There are, however, many lenses 
with either a larger number of discrete images or continuous arcs or rings formed from the host 
galaxy or extended radio emission regions. These are the systems best used to probe the details of 
galaxy mass distributions using gravitational lensing. 

Gravitational lenses are also a promising method for determining the global value of the 
Hubble constant by measuring the light propagation delays between the images (Refsdal |1964|) . 
Although time delays have been measured for six gravitational lenses, the inferred values for the 
Hubble constant are limited by the systematic uncertainties in the mass models for the systems 
(see, e.g. Impey et al. [1998 , Koopmans &: Fassnacht 1999 , Williams & Saha 2000| , Keeton et al. 
|2000 for recent examples). The uncertainties can be reduced by finding additional constraints on 
the systems with delays (see Kochanek, Keeton & McLeod 2000| ), including external constraints on 



^The accuracy is almost always better than < 5% including systematics such as the cosmological model. As is 
true for all mass estimation methods, there is a global scaling with the Hubble constant (distance), which for the 
lenses is simply that the mass is proportional to . 
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the models, or by focusing future time delay studies on lenses with fewer modeling uncertainties. 

Thus, the utility of strong gravitational lensing is maximized in systems with large numbers 
of constraints. The focus of this paper is one such system, the 10 image lens B 1933+503 (Sykes 
et al. pJ98t Marlow et al. pJ99t Chapman et al. Browne et al. \[99% Biggs et al. pOO0| , 

Norbury et al. 2000 ). B 1933+503 consists of two quadruply imaged radio sources and one doubly 
imaged radio source produced hy a zi = 0.76 lens galaxy. The source redshift is Zs = 2.62 (Biggs 
et al |2000| , Norbury et al. |2000D . The images range from unresolved to a pair of thin arcs (see the 
schematic in Figure 1). 




Fig. 1. — The geometry of B 1933+503. Images 1, 3, 4 and 6 are four images of the compact 
flat spectrum core. The peaks of images 7 and 5 form a two image system, while their wings 
combined with images 2a and 2b form a four image system. Finally, images la and 8 form a two- 
image system. The positions of the images are shown by their error ellipses, where images 5 and 7 
have small uncertainties for their two-image component and large uncertainties for their four-image 
component. The squares show the image positions found for the best SIE model of the system (see 
text). The critical curve for the model is the large ellipse in lighter type. 

Nair ( |1998D showed that a simple singular isothermal ellipsoid (SIE) could fit the available 
constraints (see also Chapman et al. [1998 for some discussion of modeling). In this paper we 



extend her analysis to a wide range of parametric mass distributions to explore the ways in which 
the complicated image pattern restricts the mass distribution. From this ensemble of models, we 
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identify common features of the acceptable models, such as the shape of the mass profile/rotation 
curve, and estimate the uncertainties in the image time delays for use in determining the Hubble 
constant. In §2 we review the necessary mathematics of lensing and describe the available data. In 
§3 we catalog the models we use and their results. We summarize the results and suggest further 
directions in §4. 



2. Models 

In this section we review the mathematical background and notation for our models (§2.1), 
discuss the mass distributions we use (§2.2) and the available data on B 1933+503 (§2.3). 



2.1. Notation 

We consider a thin lens characterized by a projected lens potential 0(x) which is a function 
of a Cartesian angular coordinate x. The potential is related to the surface mass density by 
V^^(x) = 2k(x) where k(x) is the surface density in units of the critical surface density. In 
addition to the primary lens, we include an external tidal field defined by an amplitude 7 and a 
orientation 9^. For a source at angular position u, the images are found at solutions Xj of the lens 
equation 

u = X - V(/)(x) - r • X (1) 



where the shear tensor is 



cos 29^ sin 29^ 
sin 29^ — cos 29^ 



(2) 



The shear term models not only local tidal perturbations to the lens but many of the additional 
perturbations along the line of sight (see Kovner 1987 , Bar-Kana |1996| ), and it is a necessary 



component of any realistic lens model (see Keeton, Kochanek & Seljak 1997). The inverse 
magnification tensor of the model is 



M-' = I-T-\ ^'"^ . (3) 



and the magnification of an unresolved image is given by det M. Equation (|^) can also be written 
as the condition for an extremum on a time delay surface, and the surface itself determines the 




time delay (see Schneider, Ehlers & Falco 1992). 



The procedures for modeling the positions of lensed point sources are simple to describe 
and relatively easy to implement. Equation (^) can be read as a map from the image plane to 
the source plane via a lens model (potential (f) and external shear F). The calculations in this 
paper used the lensmodel code (Keeton pOOOaD . For observed image positions Xj and model image 
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positions x^°'^(u) the goodness of fit is measured with a simple ^ statistic 

A,point / , 2 \ ' 



C7 



with (in this case) isotropic positional uncertainties cxj for each image. The generalization to 
anisotropic positional uncertainties, such as will be used here for the well as fluxes, is 

immediate. The statistic is minimized with respect to the unobserved source position, u, and 
the parameters of the lens model, where the model parameters may be further constrained by 
measurements of the properties of the lens (position, shape, orientation . . .). For non-precision 
work the problem can be further simplified by simply minimizing the projected separations of the 
images on the source plane with various flux, magnification or magnification tensor weightings to 



better mimic the correct image plane statistic (see Kochanek 1991 ). Most methods using source 
plane fitting methods are biased to find solutions with high total magnifications and steep radial 
density profiles .0 



2.2. Parametrized Models 

We model the mass distribution of the lens using a wide variety of parametrized ellipsoidal 
density distributions. Parametric models of galaxies have the advantages of being simple and 
physically motivated. The disadvantage of parametric models is that they do not span the 
complete space of physically realizable mass distributions. We have chosen a set of parametric 
models broad enough to encompass the range of radial mass distributions considered to represent 
galaxies. We have, however, restricted the models to ellipsoidal surface densities in an external 
shear field and thus use a more restrictive range of angular structures than is in principle possible 
for a real galaxy. The various model families will be discussed in this section and the formulae for 
their surface densities can be found in the Appendix. 

Before discussing the parameterized models, it should be noted that there are non-parametric 
means of modeling gravitational lenses, principally the linear programming method of Saha &: 
Williams ( |1997| ), where a gridded, positive definite surface density distribution is constructed 
to fit the data. Such models avoid the overly restrictive assumptions of parametric models 
and allow elegant solution algorithms for the problem. Coupled with these advantages are two 
caveats. First, the bulk of the mass in a galaxy, particularly the early- type galaxies that dominate 
the lens sample, must actually be composed of stars and dark matter with a positive definite, 
quasi-equilibrium phase space distribution function. This is more restrictive than a positive 



^ Roughly writing Xi — x™°'' — 5x,; for the difference between model and measured image positions Xi, and 
u — u™°'' = 5\i one has (from equations 1 and 3) 5u. = MTT^Jxi. So if one minimizes |5up/a"'^ instead of |(5x|^/(t^, 
increasing M, the magnification, will tend to improve the fit. The corresponding radial profile will tend to be steeper 
because steeper profiles generally have higher image magnifications. 
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definite density distribution, which is in turn more restrictive than its integral, a positive definite 
surface density distribution.^ Many density distributions permitted by the Saha & Williams 
(1997) method are not physically realizable configurations for galaxies. Thus, even with symmetry 



and smoothness constraints, requiring a positive definite non-parametric real space density allows 
too much freedom in the mass model. Second, the current algorithms solve the lens equations on 
the source plane which can introduce the biases mentioned above. Nonetheless, as this approach 
captures different features of the matter distribution than the one considered below, applications 
of this method to B 1933+503 would be very interesting. 

The parametric models are characterized by a lens position, x^, a mass scale, an axis ratio q (or 
ellipticity e = 1 — g), and a major axis position angle, 9. The major axis density profile is further 
specified by parameters describing the radial profile of the galaxy such as core radii, break radii 
and exponents. The models have surface density distributions K,{m) = S(m)/Sc = / dzp(x, z)/T,c, 
where p{x, z) is the density distribution and Sc is the critical surface density. The surface density 
K is a function of the ellipsoidal coordinate m? = x'"^ + y'"^ /q^, where x' and y' are Cartesian 
coordinates aligned with the major and minor axes of the lens. The precise parametric forms /t(m) 
of the surface density distributions for the models are given in the Appendix and their properties 
are discussed below. In order to simplify the comparison between models below, the discussion is 
restricted to radial properties of a spherical density /?(x, z) = p{r). 

We start with the singular isothermal ellipsoid (SIE) as a standard model. It is both 
analytically convenient (Kassiola &: Kovner 1993| , Kormann et al. 1994 , Keeton & Kochanek 1998 ) 



and the simplest mass model which is broadly consistent with our general knowledge of the mass 
distributions in observed galaxies as reviewed above. The singular isothermal sphere has a global 
/9 oc 1/r^ density profile, which means it has both a steep central density cusp and a globally flat 
rotation curve. We then explore two general families of ellipsoidal density distributions, varying in 
radial profile, that contain the SIE limit. 

The power- law models, whose spherical form is /? oc (r^ + s^)^"~^^/^, are a simple generalization 
of the isothermal spheres to include both a central core radius s and a variable logarithmic slope 
a. The isothermal sphere is the limit with s = and a = 1, and the power-law model sequence 
includes several "old-fashioned" models of galaxy luminosity profiles such as the modified Hubble 
profile (a = 0) and the Plummer model (a = —2). This model family was introduced into 
gravitational lensing because of its relative analytic simplicity and because observational evidence 
at the time suggested that galaxies possessed finite core radii (see Blandford & Kochanek 1987; 



Hinshaw & Krauss 1987). It has remained the most popular "more complicated than isothermal" 



lens model (e.g. Kochanek 1995, Grogin & Narayan 1996, Chae 1999| , Koopmans & Fassnacht 



1999) because of its history and its relative simplicity even as an ellipsoidal model (Barkana |1998 



^ In an actual phase space distribution function there may be some unoccupied orbits, i.e. there can be some 
stochasticity due to parts of the galaxy with smaller total particle numbers such as satellites and globular clusters 
which may be responsible for perturbing the image flux ratios in many lenses, see Mao & Schneider (1998). 
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Chae, Khersonsky & Turnshek 19991 ) . Unfortunately, we now know that galaxies are characterized 



by central cusp exponents rather than core radii. Thus the steeper models, a < 1 are unphysical 
while the shallower models a > 1 are approximations to realistic models with shallow density 
cusps. We include a sequence of these models with a < 1 only for historical continuity. 



The pseudo-Jaffe models are related to the Jaffe ( 1983 ) models for early- type galaxies (see 
e.g., Keeton & Kochanek 1995 and references therein). They have p oc (r^ + s^)~^(r^ + a^)~^ for a 



spherical model, and include both a central core radius s and an outer break radius a. The model 
has an approximately flat rotation curve for s ^ r ^ a, truncated by the finite central density 
inside the core radius and by the finite total mass outside the break radius. As we expect the core 
radius to be zero, we are mainly interested in the model as a simple probe of the truncation of the 
mass distribution at the break radius. The pseudo-Jaffe models match the SIE model for s = 
and a — > oo, and match the a = —1 power-law model in the limit s — > a. 

We currently lack an implementation of a family of ellipsoidal models with a variable central 
cusp exponent (the lensing equivalent of the tj models; see Dehnen 1993, Tremaine et al. |1994 ). 



The analytic lensing properties of such models were studied by Evans & Wilkinson ( [1998| ). In 
order to explore the regime of shallow density cusps we included the NFW (Navarro, Frenk 
& White 1997), de Vaucouleurs and a > 1 models in our survey. We are interested in the 



consequences of the p oc 1/r central density cusp of the NFW models, and not in its origins as the 
characteristic density profile of dark matter halos before modifications due to baryonic matter. 
The de Vaucouleurs profile also acts like a shallow, cusped density distribution {p oc r~^/^, see 
Hernquist 19901 ), as well as being the natural constant mass-to-light ratio model for gravitational 



lenses. The NFW models are characterized by a break radius a between their inner p oc 1/r 
cusp and their asymptotic p oc profile. The de Vaucouleurs model effective radius Re plays 
a physical role similar to the NFW break radius. The a > 1 power law models can model the 
central regions of galaxies with shallow cusps but cannot be good global mass models as they have 
steadily rising rotation curves even at large radii. 



2.3. Data 



We constrain the models using the relative positions of the lens galaxy and the lensed images 
and the flux ratios between the images. Here we summarize the sources for the constraints and 
their limitations. Table 1 presents the constraints and uncertainties derived from the data in 
Sykes et al. ( |1998| ), Nair ( |1998|) , Mar low et al (|1999D and Biggs et al. (|2000D . Figure 1 illustrates 
the geometry of the system. The source redshift is Zg = 2.62 (Biggs et al. 2000 , Norbury et al 



2000 ) and the lens redshift is zi = 0.76 (Sykes et al 1998 ). The available infrared and optical HST 
images are too poor to precisely align the radio maps with the HST images or to perform accurate 
astrometry or surface photometry of the lens galaxy (see Mar low et al. 1999| ). 



The lensed images (see Figure 1) are composed of three sets of multiply imaged source 
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Table 1. Image data 
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Note. — The data are derived from Sykes et al. ( |1998D , Nair ( |1998| ), Marlow et al. 
(1999), and Biggs et al. ( 20001 ). Images 5 and 7 appear twice, once as a two-image system 
and once as a four-images system with images 2a and 2b. No flux constraints were used 
in the latter case. The coordinates are centered at RA 19''34'"31.296^ Dec 50°25'22'.'519 
(J2000). The astrometric uncertainties (Major, Minor and PA) define the error ellipse, 
which is circular if there is only an entry for the Major axis. The flux ratios have signs 
corresponding to their absolute parities. The flux ratio assigned to the galaxy is the 
upper limit on the total flux of any central image and corresponds to a flux of 0.072 mJy 
in the 8 GHz Biggs et al. ( pOO0| ) maps. 
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components. There are four images of the flat spectrum core (image 1, 3, 4 and 6) which have 
very precise astrometry from the Sykes et al. (1998) VLBI observations. There are two images of 



one steeper spectrum component (images la and 8) whose astrometry is relatively poor because 
of their low fluxes and proximity to the brighter core images. There are four images of a second 
steeper spectrum component (images 2a, 2b, 5 and 7) but these cannot be naively treated as four 
images of the same source point because of their extended structure (see Nair 19981 ). In particular. 



the peaks of images 5 and 7 are only doubly imaged and do not have counterparts in images 2a 
and 2b. There is no gap in the arc formed by images 2a and 2b, so they contain only partial 
images of the source component producing images 5 and 7. Following Nair ( 1998]) we treat the 



peaks of images 5 and 7 as a two-image system. We then consider all four images 2a/2b/5/7 as a 
four-image system, using error ellipses corresponding to the major and minor axes of the images 
to compensate for our uncertainty in precisely how the extended structure should be modeled. 

The flux ratios of the images depend on both frequency and time. To derive the flux ratios 
of the images we averaged the flux ratio measurements by Sykes et al. ( [L998| ) and Biggs et al. 
(2000). The differences are probably dominated by systematic differences in the resolution and 



sampling, temporal variability, and intrinsic systematic problems with flux ratios (see Mao & 



Schneider |1998D rather than simple measurement errors. Thus, the problem with flux ratios is 
always the precision with which they should be imposed. For images 1, 3, 4 and 6, we used 50% 
errors on the fluxes because the large flux ratio between image 4 and the other images is unlikely 
to be due to the effects of lensing - generic lens configurations do not generate such flux ratio 
patterns for point images. For images la and 8 we used the reported flux errors. For images 5 and 
7 modeled as a two-image system, we used 10% flux errors (which we view as a lower bound on 
any flux ratio constraints), and for images 2a, 2b, 5 and 7 modeled as a four- image system we used 
no flux constraints because we are modeling extended rather than point images. We will explore 
the consequences of changing these estimates of the uncertainties on the results. 

No central radio image is detected in any of the radio maps, which is a powerful constraint 
on the central surface density of the lens. In our standard models we used a detection limit for 
central images of 0.072 mJy based on the 3cr detection limit in the 8 GHz observations by Biggs 
et al. (|2000| ). This is very conservative, both because we use a 3a limit as a la limit and because 
we apply it to the central image for each image system separately rather than to the sum (i.e. if a 
model produced an 0.072 mJy central image in its model for each of the la/8, 1/3/4/6 and 5/7 
image systems we will count it as a 3a detection instead of a 9a detection). We will explore the 
consequences of using a tighter 0.024 mJy (la) constraint. No position constraint was imposed on 
this undetected central image. 



3. Results 



We now describe the results of modeling the system with the density distributions discussed in 
:2.1. These consists of three discrete models (SIE, NFW and deV) and two two-parameter model 
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10"' 10"^ 10"' 0.1 1 10' 10^ 

iog(scaie) (arcsec) 



Fig. 2. — The values for the best fit models in each class ordered by their scale radii. The 
power law models (the sequence labeled an for a = n) are shown at the best fit core radius s for 
a < 1. For a > 1 where the core radius is unimportant the models are simply ofi'set horizontally 
to distinguish the points. The of the pseudo-Jaffe models are shown as a function of the outer 
break radius a (the sequence labeled pja for break radius a). The reference SIE model is shown 
at log (scaZe) = 2 because it is the limit of the pseudo-Jaffe sequence as a — >^ oo. The SIE model 
is almost identical to the a = 1 softened power law model. The best NFW and de Vaucouleurs 
(labeled deV) models are shown at the appropriate NFW break radius log (a) and de Vaucouleurs 
effective radius log(i?e)- 

families (the power law models and the pseudo-Jaffe models). All the models use ellipsoidal density 
distributions and include an independent external shear. Models without the shear terms fit the 
data so poorly that we do not discuss their properties. The necessity of an independent external 
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shear is expected (see Keeton, Kochanek & Seljak [L997 ). The external shear required varied from 
between 2% and 14 % for the best {2a models), and the galaxy positions varied by < 10 mas. 
The images will strongly constrain the mass distribution over the region containing images (from 
1.5h~^ kpc to 4.1/i~^ kpc) with the mass being most tightly constrained near the critical radius 
of 2.6/1-1 kpc, where I'.'O = 5.1/1"^ kpc (for = 0.3, Aq = 0.7, and Hq = lOO/i km s'^ Mpc-^). 
Table 2 summarizes the results. 



3.1. Statistical Summary 

The data supply a total of 26 constraints, and the models have Np = 7 or 8 parameters, 
leaving us with N^of = 19 or 18 degrees of freedom. We will explore the results using standard 
results for Gaussian statistics, which we now summarize. 

1. Comparing different models: For Ndof degrees of freedom, the value of our fit statistic should 
follow a distribution with Nfioj degrees of freedom with a mean value of N^of and a 
variance of (2A'rfo/)^^^- For large N^of the distribution is asymptotically Gaussian, a limit we 
can use relatively safely. Thus, for N^of = 18 the mean and variance of the distribution 
are 18 and 6 respectively. 

2. Rescaling the statistics: The best models have — H ^or N^oj = 18 which is somewhat low 
although statistically possible (23% of models with N^of = 18 would have < H or > 25 
for Gaussian statistics). However, we may have overestimated the uncertainties or degrees of 
freedom in the model. We can adjust for this by rescaling the models to make the best fit 
model have Xmin — ^dof , which corresponds to shrinking the typical constraint uncertainties 
by 20% (an entirely plausible correction). Since we do not rescale the fit statistics, we may 
be overestimating the range of acceptable models and the uncertainties on their parameters. 

3. Parameter significance: For a multi-parameter model like the power-law models, the 
introduction of a parameter (e.g. adding a core radius to the SIE to fit the more general 
isothermal ellipsoid) is statistically significant if the change in the fit statistic with the 
addition of the parameter, |Ax^| > fumX^/^dof, exceeds the results without that parameter 
by a factor fum = 0.5, 3.0 or 8.1 for significance levels of 50%, 90% and 99% compared to 
the improvement expected for a random variable. These limits are based on the F-test for 
adding one variable to a model with N^of = 18 or 19 degrees of freedom. 

4. Parameter estimation: For a multi-parameter model, parameter ranges are estimated from 
the change in the fit statistics relative to the best model, A^^ = ~ Xmin where the 
parameter range for one variable is acceptable at the la, 90%, 2a and 99% confidence level 
for Ax^ = 1.0, 2.71, 4.00 and 6.63 respectively. Upper bounds and parameter ranges quoted 
below will be la bounds (A^^ = 1) unless otherwise noted. 
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Table 2. The Statistical Properties of The Fits 
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Note. — The model families and parameters are described in §2.2 and Appendix A. Figures 
2 and 3 compare the models graphically. The number of parameters varied is Np, leading to 
a model with N^^f degrees of freedom and goodness of fit x^- We only present representative 
pseudo-JafFe models as the best fit model corresponds to the softened isothermal ellipsoid 
(the limit a — oo). The SIE is the same as the softened isothermal model {a = 1) in the 
limit s ^ 0, and the a = — 1 power law model is the same as the pseudo-JafFe model in the 
limit that s — ^ a. 
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3.2. Model Properties 

We start with the standard SIE model with external shear, which has x^/-^do/ = 12.0/19. 
The model results are compared to the constraints in Figure 1 and are remarkably good, given 
the incredible simplicity of the model. The SIE model is a particular example of both the power 
law and pseudo-Jaffe two-parameter models, which we consider next. Adding a core radius to the 
SIE, making a general isothermal ellipsoid, slightly improves the fit to x^/Ndof = 11.4/18 but 
the improvement is not statistically significant (66% for the F-test). The formal core radius is 
~ 140/i~^ pc, but the more useful result is an upper limit of ~ 260/i~^ pc. The fit is not improved 
by adding an outer break radius a to the isothermal density profile (the pseudo-Jaffe models). 
The value of monotonically rises as the break radius is reduced from the SIE limit (a — > oo), 
leading to a lower bound of a > 1'.'4 ~ 7.1h~^ kpc. The core radius of the pseudo-Jaffe model plays 
the same role as that of the isothermal ellipsoid - a small core radius slightly, but insignificantly, 
improves the fit. 

The traditionally favored power-law models, with a < 1, a finite core radius, and a more 
rapidly declining asymptotic density distribution than the isothermal models, have increasingly 
large values for x"^ as the mass distribution becomes more centrally concentrated (see Figure 2). 
These models require large core radii to fit the data (e.g. (653^gy)/i^^ pc for the modified Hubble 
profile, see Figure 3). In addition, for a < models, external shears of > 20% are required, 
increasing as a decreases. These large external shears make the interpretation of the parameters 
questionable-consequently we have cut off our analysis at the a = —1 model (shear ~ 23%); 
the a < —1 models have x"^ > 35 and external shears of > 27%. The power law models are 
consequently unacceptable unless nearly isothermal (a ~ 1), as we might expect from our current 
understanding of the structure of galaxies (see §1). The poor fit of the models is not solely due to 
the restriction on the visibility of the central image, as these models remain unacceptable when 
we weaken or eliminate the constraints on the image flux ratios (see below). 

The fit does improve significantly for power-law models with a > 1, where the models are 
mimicking galaxies with softer central density cusps than the SIE model. For these models the 
core radius must be small, and for simplicity we only show the results with the core radius fixed.0 
The best fit is found for a = 1.14:_q\j which corresponds to a density cusp with p oc r ' -f i^ 
The exponent a, unlike the core radius s, is a statistically significant new parameter (82% for the 
F-test). Unfortunately, the a > 1 models are poor global models because of their steadily rising 
rotation curves. 

The NFW and de Vaucouleurs (deV) models provide examples of models with shallow central 
cusps (near p oc 1/r) and are reasonable global models for the mass distribution. (The a > 1 
models cannot produce such shallow cusps because as q ^ 2 they become constant surface density 



^For example, the a — 1.1 power-law model has the same for optimized core radius (0'.'0002) and for the 
core radius fixed as in Table 2. 
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sheets rather than p oc 1/r cusps because of how the hmits are ordered.) Figure 3 shows the 
variation in with the de Vaucouleurs effective radius Re, the NFW break radius a, and, for 
comparison, the core radius s of the isothermal elhpsoid (a = 1) and the pseudo-Jaffe model 
break radius a. The NFW model is barely acceptable at 2a significance, with a break radius 
of a = (1.53l[j:i^)/i-i kpc. The more cuspy de Vaucouleurs models produce acceptable fits for 
an effective radius of Re = (14^4)/i^^ kpc. This is much larger than is realistic for a constant 
mass-to-light ratio model of the lens galaxy.^ 

The uncertainties we used for the image flux ratios are somewhat arbitrary (see §2.3), so 
we explored the effects of changing these limits. We recalculated all the models with 50% flux 
uncertainties for all images {Xmin/^dof = 9.0/19), no flux constraints at all {Xmin/^dof = 6.7/10), 
and using the larger of the quoted errors in the flux measurements or 10% of the flux 
iXmin/^dof = 61.5/19). In all three of these cases the best-fit model was the SIE, or equivalently 
the a = 1 isothermal ellipsoid with the statistically insignificant addition of a core radius. 
The a = 1.14 best-fit model for our standard flux ratio uncertainties remains within la of the 
best fitting models for two cases, and had a Ax^ = 1.4 for the case where 10% flux limits are 
used. In the case where no flux limits are imposed the a < 1 models are still disfavored, with 
a > 0.5(lcr), — 0.4(2fj). The overall statistical ordering of the models is essentially unaffected by 
changes in the choice for the flux ratio uncertainties. We also explored effects of changing the 
limits on the flux of any central images, discussed below. 

The central cusp exponent or core radius of the distribution determines the visibility of the 



central or "odd" image (Narasimha, Subramanian & Chitre 1986 ), whose flux decreases as the 
central surface density or concentration of the lens increases. Thus, the limits on the core radius 
depend on the treatment of the central image. Figure 4 illustrates the consequences of tightening 
the limit on the flux of central images from 0.072 mJy (3o" in the 8 GHz map) to 0.024 mJy (la) 
for the isothermal ellipsoid (a = 1), the a = —1 power-law, and the de Vaucouleurs models. The 
scale radii are forced to decrease, and for the a = —1 power-law model and the de Vaucouleurs 
model the statistics rise significantly. The change in for models with low is generally 
small (Ax^ < 1). Mao, Witt &; Koopmans (2000) explored the effects of adding a central black 
hole to the lens galaxy as an alternate means of suppressing the flux of central images. While we 
do not expect the addition of a central black hole to modify the global trends discussed below 
(as model comparisons were essentially unchanged when the flux constraints were completely 
removed), it would be interesting to include central black holes in future analysis. 

Finally, we explored the effects of further generalizing the models. In particular we were 
concerned that the limitation to ellipsoidal models might force an undesirable correlation between 
the monopole and quadrupole structures of the lens galaxy. We tested this by modeling the galaxy 
as a combination of two pseudo-Jaffe models where the inner cutoff radius of one equaled the 



^While the HST images are not adequate to fit a photometric model, the optical and IR emission is confined to a 
region comparable in size to the lens, so a real constant mass-to-light ratio model would have Re < I'.'O {5h~^ kpc). 
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outer core radius of the other (see Keeton et al. 2000| ) and then optimized the relative masses, 
elhpticities and orientations of the two components. The best fit models had ~ 10 — 11 and 
2 or 3 fewer degrees of freedom, so the improvements in the fit were not statistically significant 
given the reduction in the number of degrees of freedom. Modeling this system as a combination 
of two other models (Keeton |2000b| ) produced similar behavior in the monopole and quadrupole 
structures discussed below. 



3.3. Global trends 

Parametric models make it difficult to see the global trends implied by the range of acceptable 
models or to compare the physical properties of the individual models. We can illustrate the 
similarities and differences of the models by looking at the monopole and quadrupole deflection 



profiles of the models (see e.g. Kochanek 1991 for definitions). For a potential such that 
V^(/> = 2Av(r, 9) = 2S(r, 9) /Tie, the monopole deflection of the lens is 

ao(r) = - r r'dr' ^ d9V^{r) = - f r'dr' ^ d9^^ = M^illl (5) 
r Jo Jo r Jo Jo 

where M{< r) is the mass enclosed by the circle and the coordinates are centered on the lens 
galaxy. The rotation curve of the galaxy is roughly proportional to the square root of the deflection 
monopole (projection effects modify this slightly). Generic deflection or rotation curve profiles 
have three radial regimes. At small radii there is a core region where the deflection can go to zero 
either because of a finite core radius or the presence of a shallow density cusp. The size of the 
region is set either by the core radius or the break radius where the steep outer density cusp is 
matched to the shallower inner cusp. In the inner region the rotation curve rises, and can become 
flat for a while. At large radius the distribution must have a break beyond which the rotation 
curve becomes Keplerian (or near Keplerian for the NFW model). 

Figure 5 shows the monopole as a function of radius for the models from Table 2. As is 



typical of lens models (see Kochanek 1991), all models agree very precisely on the mass enclosed 
by the mean critical line of the lens. For a critical surface density of Sc = 0.62h g cm~^ (Jig = 0-3, 
Aq = 0.7), the mass inside r = O'.'S is 6.0 x lO^^/i^^M© independent of the parametric model. 
The statistically acceptable models have slightly rising or falling deflection profiles, and since 
ao oc the rotation curve of the acceptable models varies by < 8.5% from the average between 
the inner and outer images {1.7h~^ kpc to 4.1/i~^ kpc). The model with the least flat rotation 
curve {a = 1.3) has a core of less than O'.'OOOl, and so projection effects should not alter this limit 
significantly (for the spherically symmetric case the change is less than 0.01%). The models with a 
slightly falling rotation curve are the pseudo-Jaffe models, and the models with the rising rotation 
curves are the a > 1 shallow-cusp power law models. The worst fitting models from Table 1 tend 
to have rapidly rising and then falling rotation curves as shown on the right in figure 5. 



The second simple physical property of the models we can compare is the quadrupole of the 
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lens. We define the quadrupole by the appropriate multipole moment of the angular deflection, 



7 cos 26^ 
7 sin 26-. 



d_ 



- sin 26 
cos 26 



(6) 



including both the lens galaxy and the external shear in the potential. We have defined the 
quadrupole such that the quadrupole of an SIE model is proportional to and the quadrupole 
of an external shear is independent of radius. An external shear (7/2) cos(2^ — 26,y) produces a 
quadrupole moment 7 at orientation 6-y. Figure 6 shows the amplitude and major axis PA of the 
quadrupole moment as a function of radius for the statistically acceptable models]^ The measured 
PA of the major axis of the lens, —40° it 5° (Sykes et al 199^ ) is shown for comparison. All the 



sucessful models show a similar decline and flattening of the strength of the quadrupole combined 
with a small ~ 3° swing in orientation. The poorly fitting power law models also have very 
different quadrupole profiles from the acceptable models. When we tested the composite models, 
where the properties of the monopole and quadrupole are largely decoupled, the monopole and 
quadrupole deflection profiles of the best two component models were little changed from the 
acceptable single profile models. Some examples are included in Figure 6. 

Although B 1933+507 has yet to show the variability needed to measure a time delay (Biggs 



et al. 2000), the constraints on the mass distribution are significantly better than for the lenses 
with delay measurements. Figure 7 shows the variation in the predicted longest delay between 
the compact radio components (between images 1 and 6) as a function of the statistic for the 
various models. The delay depends on the outer cutoff of the mass distribution, with the delay 
rising as the mass distribution is truncated at smaller radii and the depth of the central potential 



decreases. This is a common feature of lens time delays (see Keeton Sz Kochanek 1997, Impey 
et al. pJ98t Koopmans & Fassnacht |l99|, Witt, Mao & Keeton pOOOl) . Formally we find a delay 
range of {W.6tfj)h-^ days. As seems to be typical of lens systems (see Keeton & Kochanek |1997| , 
Impey et al. 1998|) , the ratios of the delays between the images depend weakly on the model. 
For example, the ratio of the delay between images 1 and 4 to the delay between images 1 and 6 
changes by only ~ 3% (< 0.4/i"^ days) for the range of the statistically acceptable models. Models 
with central cusps which are more realistic than the a > 1 power law models will have an outer 
break radius similar to the pseudo-Jaffe models. As the models are varied by decreasing the break 
radius the time delays will rise. 



4. Conclusions and Prospects 

After fitting a wide range of parameterized, ellipsoidal mass distributions to the B 1933+503 
system, we conclude that the mass distribution must have an approximately flat rotation curve 



^Note that the PA is related to {x,y) via x — —rs'mO, y = rcos9, while the angles in the equations obey the 
more standard definition x — rcosO, y = rsinO. 
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with a cuspy central density distribution. Over the range spanned by our acceptable models the 
rotation curve deviates from the average by less than ~8.5% between the inner and outer images 
{1.7h~^ kpc to A.lh~^ kpc). The central cusp needs to be quasi-isothermal {p oc r~^) and models 
with shallow cusps (/o oc 1 /r) fail to fit the data. Traditional power law models with finite central 
core radii are inconsistent with the data, as we would expect from our current understanding of 
galaxy mass distributions. Both the monopole and quadrupole deflection profiles of the acceptable 
models are similar and stable to increasing the complexity of the models further. The historically 
popular power-law models for gravitational lenses should be abandoned in favor of the lensing 
equivalents of the //-models (Dehnen |1995 , Tremaine et al. 1994| ). These models are parametrized 



by a central cusp exponent and break radius between an interior cusp and a more rapidly declining 
outer density profile, rather than a core radius and the slope of the asymptotic density profile. 

So far, all lenses which have been analyzed to determine the radial mass distribution of 
the lens prefer mass distributions corresponding to nearly fiat rotation curves. This is true of 
MG 1654-^1346 (Kochanek [1995D , MG 1131-^0456 (Chen et al. [1995D , Q 0957-^561 (Keeton et 



al. |2000D , PG 1115-F080 (Kochanek et al. pOOC| ) and B 1933-^507 (this paper). No lens with 



an image distribution suitable for determining the radial mass profile has ever found results 
which are grossly inconsistent with a nearly fiat rotation curve. Similar results are also found for 
observations of nearby early-type galaxies with X-ray halos (Fabbiano |1989| ) and modern stellar 



dynamical studies (e.g. Rix et al. |1997[) . 

The predicted time delays of B 1933-1-503 are relatively well constrained at (10.6l;^;^)/i~^ days 
between images 1 and 6 over the wide range of possible mass distributions we explored. A ~22% 
modeling uncertainty is encouraging because there are excellent prospects for further improving 
the model constraints. Improved radio maps with a better quantitative understanding of the 
mapping between images 2a, 2b, 5 and 7, astrometry of images la and 8, and tighter constraints 
on the fiux of any central images would all be useful. The time delay ratios between the images 
can constrain the models if measured to 3% or better. Deeper infrared images of the system, 
particularly with the Hubble Space Telescope, could enormously improve the relative astrometry 
between the lens galaxy and the radio sources and accurately determine the structure of the lens 
galaxy (ellipticity, orientation • • •). Most importantly, the existing HST images strongly suggest 
the presence of lensed images of the radio source host galaxy. Deep imaging to measure the shape 
of the Einstein ring image of the host would dramatically improve the constraints (see Kochanek, 
Keeton & McLeod pOO0| ). 
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A. Summary of Models 

Here we present the analytic expressions for the model surface densities in terms of the 
ellipsoidal coordinate m, defined as = x'"^ + y'^/g^, where x' and y' are Cartesian coordinates 
aligned with the major and minor axes of the lens. The singular isothermal ellipsoid (SIE) is 
defined by 

At57fi(m) = ^-^ SIE, (Al) 

2 \m\ 

and the power-law models are defined by 

Ka(Tn) = if' + rr?y^'^ ^ power law . (A2) 

The power-law model with a = 1 and s = is the SIE model. The pseudo-Jaffe model is the 
difference of two softened isothermal ellipsoids (a = 1 with a finite core radius), where 



Kpj{m) = -b 



1. f / o o\-V2 / n , o\-l/2' 



pseudo — Jaffe (A3) 



with s < a. The core radius of the profile is s and the break radius is a. It matches the SIE model 
for s — >^ and a — > oo, the general a = 1 power-law model for a — >^ oo and the a = —1 power law 
model in the limit that s ^ a with b{a — s) kept constant. The de Vaucouleurs model is defined by 



where 



deVaucouleurs (A4) 

A; = 7.67 and AT = / ve-^"" dv = 1.683 x 10'^ (A5) 
Jo 

and the NEW model by 

KNFwim) = 2ks , , ' I NEW (A6) 
{m/aY — 1 

where 

} ^ iarT^ \/ / — 1 m/a>l 



1 

1 mja = 1 



J^(m/a) = { , \ tanh-Vl-m2/a2 m/a < 1 (A7) 
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Fig. 3.— (Left) The for the power law models as a function of the core radius s. Results 

arc shown for fixed exponents of a = 1.0 (the softened isothermal model), 0.5, 0.0 (the modified 
Hubble model), —0.5, —1.0 going from left to right at the minima. (Right) The a-s a function of 
the NFW break radius a and the de Vaucouleurs effective radius Re as compared to the softened 
isothermal core radius s and the pseudo-Jaffe outer break radius a. 
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Fig. 4. — The effect of stronger limits on the flux of any central images. We show the for the 
softened isothermal model, the a = —1 power-law model and the de Vaucouleurs model using either 
our standard, weak {3a, 0.072 mJy) constraint (solid lines) or a stronger (la, 0.024 mJy) constraint 
(dashed lines) on the flux of any central images. For model requiring finite core radii the stronger 
constraint forces the to increase and the best fit core radius to decrease. 
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Fig. 5. — The monopole deflection of acceptable (left) and unacceptable (right) mass models. The 
monopole deflection is proportional to v'^ ~ M{< r)/r where M(< r) is the mass enclosed by a 
circular aperture of radius r. The circular velocity is roughly proportional to the square root of the 
deflection, (M(< r)/r)^/^. The models with acceptable fits have < Xmin~^^ the models with 
unacceptable fits have > Xmin + ^- '^^^ types distinguish between models in different 
ranges. The triangles below the curve show the radial positions of the 10 images. The constraints 
on the mass distribution are strongest in the annulus containing the images. 
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Fig. 6. — The magnitude 7 (top) and orientation 9^ (bottom) of the quadrupole of the deflection. 
Note that here, unhke in the text, the angle is given in terms of position angle (i.e. x = —r sin 9, y = 
rcos9). A quadrupole of a simple external shear is illustrated by the horizontal lines in each panel. 
A range of acceptable models are shown by the light gray lines, and the heavy dark line and the 
dashed line show two unacceptable models. We include some of our hybrid two-galaxy models to 
show that the quadrupole structure is stable when we allow the mass distribution additional degrees 
of freedom for its angular structure. The position angle of the lens galaxy is shown at its measured 
value of -40 ± 5° (Sykes et al. |1998|) . 
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Fig. 7. — The dependence of the longest time delay (between images 1 and 6) on the model and 
the fit statistics . There is a well defined prediction for the time delay including the model 
uncertainties. The variation in the time delay shows the characteristic pattern that the time delay 
rises as the mass distribution becomes more centrally concentrated. Models with central cusps 
which are more realistic than the a > 1 power law models would have an outer break radius similar 
to that of the pseudo-Jaffe models. As the central break radius approached the Einstein radius of 
the lens, the time delay and (presumably) the would rise in a similar manner. 



